M-ary predictive coding: a nonlinear model for speech
Abstract
Speech coding is pivotal to the ability of networks to support multimedia services. The technique currently used for speech coding is linear prediction, which models the vocal tract as an all-pole filter, i.e. by a linear difference equation. However, the physical nature of the vocal tract itself suggests that it is nonlinear. Developing a nonlinear model is difficult, as are solving nonlinear equations and verifying nonlinear schemes. In this paper, two nonlinear models, Quadratic Predictive Coding and M-ary Predictive Coding (MPC), are proposed. The equations for the models were developed and the coefficients solved for. MATLAB was used for implementation and testing. This paper gives the equations and preliminary results. Proposed enhancements to MPC are also discussed in brief.

1. AIM

The objectives of this endeavour were to:
i. Investigate the two nonlinear models proposed by the author for more effective speech coding.
ii. Develop one of the two models, MPC (M-ary Predictive Coding), as a complete coding scheme.

2. INTRODUCTION

Speech is a very special signal for a variety of reasons. The most basic of these is that speech is a non-stationary signal, which makes it difficult to analyze and model. The second reason is that factors such as intelligibility, coherence and other such 'human' characteristics play a vital role in speech analysis, as against statistical parameters like mean square error. The third reason is from a communications point of view: describing one second of speech requires at least 8000 discrete values, so, given concerns of bandwidth, compression is desirable.

In the technique of linear prediction, the vocal tract is modeled as an all-pole filter [1]. That is, the transfer function of the vocal tract is written as

$$H(z) = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}}$$

In terms of a difference equation,

$$y[n] = \sum_{i=1}^{p} a_i \, y[n-i]$$
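As a rough illustration of the linear difference equation above, the coefficients a_i can be estimated from a frame of samples. The sketch below is written in Python rather than the MATLAB used in the paper. It assumes the autocorrelation method solved by the Levinson-Durbin recursion, which is a standard choice but is not stated in the text. The function names and the synthetic test signal are hypothetical, chosen only for illustration.

```python
import random


def autocorrelation(signal, max_lag):
    """Return R(0)..R(max_lag), where R(k) = sum over n of y[n] * y[n-k]."""
    n = len(signal)
    return [
        sum(signal[t] * signal[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for a_1..a_p in y[n] ~ sum_i a_i * y[n-i].

    Returns (coefficients, residual_energy). Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)  # a[0] is unused so indices match a_1..a_p
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0:
            break  # degenerate (e.g. silent) frame: stop refining
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err  # reflection coefficient for stage m
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def predict(coeffs, history):
    """Predict y[n] from past samples; history[-1] is y[n-1]."""
    return sum(c * history[-(i + 1)] for i, c in enumerate(coeffs))


def synth_ar2(length, seed=0):
    """Hypothetical test signal: y[n] = 1.3 y[n-1] - 0.6 y[n-2] + noise."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(length):
        y.append(1.3 * y[-1] - 0.6 * y[-2] + rng.gauss(0.0, 1.0))
    return y[2:]


if __name__ == "__main__":
    frame = synth_ar2(4000)
    r = autocorrelation(frame, 2)
    coeffs, residual = levinson_durbin(r, 2)
    print("estimated a_1, a_2:", coeffs)
    print("residual / signal energy:", residual / r[0])
```

On a signal that really is generated by a second-order linear recursion, the estimated coefficients should land close to the generating values, and the residual energy should be well below the signal energy. Real speech is non-stationary, so in practice the estimate would be repeated frame by frame.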
Similar resources
Nonlinear speech coding model based on genetic programming
An improved genetic programming is proposed in this paper to construct the nonlinear models of speech signals, and the speech coding is further accomplished. After the preprocessing of the speech signals, the improved GP is used to construct the corresponding model of each speech frame. Then by analyzing these models, a normalized model that has generalization ability is obtained. And finally t...
Improved Speech Coding Based on Open-Loop Parameter Estimation
A nonlinear optimization algorithm for linear predictive speech coding was developed early that not only optimizes the linear model coefficients for the open loop predictor, but does the optimization including the effects of quantization of the transmitted residual. It also simultaneously optimizes the quantization levels used for each speech segment. In this paper, we present an improved metho...
Discriminative Training for Neural Predictive Coding Applied to Speech Features Extraction
In this paper, we present a predictive neural network called Neural Predictive Coding (NPC). This model is used for nonlinear discriminant features extraction (DFE) applied to phoneme recognition. We validate the nonlinear prediction improvement of the NPC model. We also present a new extension of the NPC model: NPC-3. In order to evaluate the performances of the NPC-3 model, we carried out ...
Fully vector-quantized neural network-based code-excited nonlinear predictive speech coding
Recent studies have shown that non-linear prediction can be implemented with neural networks, and non-linear predictors will on average achieve about 2-3 dB improvement in prediction gain over conventional linear predictors. In this paper, we take advantage of non-linear prediction with neural networks, apply it to predictive speech coding and attempt to improve the speech coding performance....
Predictive vector quantization using the M-algorithm for distributed speech recognition
In this paper we present a predictive vector quantizer for distributed speech recognition that makes use of a delayed decision coding scheme, performing the optimal codeword searching by means of the M-algorithm. In single-path predictive vector quantization coders, each frame is coded with the closest codeword to the prediction error. However, prediction errors and quantization errors of futur...
Publication date: 2003